Text-independent voice conversion using speaker model alignment method from non-parallel speech

نویسندگان

  • Peng Song
  • Yun Jin
  • Wenming Zheng
  • Li Zhao
چکیده

In this paper, we propose a novel voice conversion method called speaker model alignment (SMA), which does not require parallel training speech. Firstly, the source and target speaker models, described by Gaussian mixture model (GMM), are trained, respectively. Then, the transformation function of spectral features is learned by aligning the components of source and target speaker models iteratively. Additionally, the transformation function is further combined with GMM, enabling the multiple local mappings, and a local consistent GMM (LCGMM) is also considered for model training to improve the conversion accuracy. Finally, we carry out experiments to evaluate the performance of the proposed method. Objective and subjective experimental results demonstrate that compared with the wellknown INCA approach, the proposed method achieves lower spectral distortions and higher correlations, and obtains a significant improvement in perceptual quality and similarity.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

طراحی یک روش آموزش ناموازی جدید برای تبدیل گفتار با عملکردی بهتر از آموزش موازی

Introduction: The art of voice mimicking by computers, has with the computer have been one of the most challenging topics of speech processing in recent years. The system of voice conversion has two sides. In one side, the speaker is the source that his or her voice has been changed for mimicking the target speaker’s voice (which is on the other side). Two methods of p...

متن کامل

Speaker-adaptive-trainable Boltzmann machine and its application to non-parallel voice conversion

In this paper, we present a voice conversion (VC) method that does not use any parallel data while training the model. Voice conversion is a technique where only speaker-specific information in the source speech is converted while keeping the phonological information unchanged. Most of the existing VC methods rely on parallel data—pairs of speech data from the source and target speakers utterin...

متن کامل

Voice Conversion Using Articulatory Features

The aim of voice conversion is to transform an utterance spoken by an arbitrary (source) speaker to that of a specific (target) speaker. Text-to-speech (TTS), speech-to-speech translation, mimicry generation and human-machine interaction systems are among the numerous applications which can be greatly benefited by having a voice conversion module. Generally voice conversion systems require para...

متن کامل

Text and speaker independent voice conversion

This paper describes an approach to the challenging problem of text and speaker independent voice conversion. The approach is based on target speaker’s speech production process parameterization using harmonic analysis. Unified model allows processing of any input speech regardless of its content and source speaker. The method provides subjective quality of conversion that is comparable with te...

متن کامل

Text-independent cross-language voice conversion

So far, cross-language voice conversion requires at least one bilingual speaker and parallel speech data to perform the training. This paper shows how these obstacles can be overcome by means of a recently presented text-independent training method based on unit selection. The new method is evaluated in the framework of the European speech-to-speech translation project TC-Star and achieves a pe...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014